On Optimizing a Class of Multi-Dimensional Loops with Reductions for Parallel Execution
Abstract
This paper addresses the compile-time optimization of a form of nested-loop computation that is motivated by a computational physics application. The computations involve multi-dimensional surface and volume integrals where the integrand is a product of a number of array terms. Besides the issue of optimal distribution of the arrays among the processors, there is also scope for reordering the operations using the commutativity and associativity properties of addition and multiplication, and for applying the distributive law, to significantly reduce the number of operations executed. A formalization of the operation minimization problem and a proof of its NP-completeness are provided. A pruning search strategy for determining an optimal form is developed. An analysis of the communication requirements and a polynomial-time algorithm for determining the optimal distribution of the arrays are also provided.
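The operation-count reduction mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or search strategy; it is a hypothetical example, with made-up arrays `A` and `B` and helper names `naive_sum` and `factored_sum`, showing how the distributive law turns a triple-nested sum of products into two partial sums followed by a single loop.

```python
# Illustrative only: the sum over i, j, k of A[i][j] * B[j][k] can be
# rewritten as the sum over j of (sum_i A[i][j]) * (sum_k B[j][k]).
# Function and variable names here are assumptions, not from the paper.

def naive_sum(A, B):
    """Direct triple loop; returns (value, number of multiplications)."""
    ni, nj, nk = len(A), len(B), len(B[0])
    total, mults = 0, 0
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                total += A[i][j] * B[j][k]
                mults += 1
    return total, mults


def factored_sum(A, B):
    """Same value after applying the distributive law to hoist the
    sums over i and k out of the product."""
    ni, nj, nk = len(A), len(B), len(B[0])
    # Partial sums that depend only on the shared index j.
    col_a = [sum(A[i][j] for i in range(ni)) for j in range(nj)]
    row_b = [sum(B[j][k] for k in range(nk)) for j in range(nj)]
    total, mults = 0, 0
    for j in range(nj):
        total += col_a[j] * row_b[j]
        mults += 1
    return total, mults


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]            # indexed A[i][j]
    B = [[5, 6, 7], [8, 9, 10]]     # indexed B[j][k]
    print(naive_sum(A, B))     # 12 multiplications
    print(factored_sum(A, B))  # 2 multiplications, same value
```

With integer inputs both forms return exactly the same value; with floating-point data the reordering can change rounding slightly, which is one reason such rewrites rely on treating addition and multiplication as associative and commutative.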
Similar Resources
Loop Transformations for Parallel Execution of a Class of Nested Loops on Shared-Memory Multiprocessors
Computationally intensive multi-dimensional integrals involving products of several arrays arise in some computational physics codes modeling electronic properties of semiconductors. This paper develops a framework for optimizing the parallel execution, on shared-memory multiprocessors, of a class of nested loop computations motivated by this application domain. The framework addresses the selec...
Speculative Parallel Execution of Loops with Cross-Iteration Dependences in DSM Multiprocessors
Speculative parallel execution of non-analyzable codes on Distributed Shared-Memory (DSM) multiprocessors is challenging due to the long latency and distribution involved. However, such an approach may well be the best way of speeding up codes whose dependences cannot be analyzed by the compiler. In previous work, we suggested executing the loop speculatively in parallel and adding extensions to the...
Scientific Flow Field Simulation of Cruciform Missiles Through the Thin Layer Navier Stokes Equations
The thin-layer Navier-Stokes equations are solved for two complete missile configurations on an IBM 3090-200 vector-facility supercomputer. The conservation form of the three-dimensional equations, written in generalized coordinates, is finite-differenced and solved on a body-fitted curvilinear grid system developed in conjunction with the flowfield solver. The numerical procedure is based on ...
A Runtime Framework for Optimizing Multi-Dimensional Array Accesses on Multi-core Processors
Scientific and numerical applications rely on multidimensional array data accessed in nested loops. These regular data access patterns can benefit from explicitly managed local memories, such as the local stores of the Cell processor. We present Strider, a runtime library framework which helps programming and optimization of multi-dimensional data accesses in nested loops, on multi-core process...
Achieving Full Parallelism Using Multidimensional Retiming
Most scientific and Digital Signal Processing (DSP) applications are recursive or iterative. Transformation techniques are usually applied to get optimal execution rates in parallel and/or pipeline systems. The retiming technique is a common and valuable transformation tool in one-dimensional problems, when loops are represented by data flow graphs (DFGs). In this paper, uniform nested loops are m...
Journal title:
- Parallel Processing Letters
Volume 7, Issue
Pages -
Publication date: 1997